Whole-Blood Gene Expression Profile After Hypoxic-Ischemic Encephalopathy

This case-control study compares the genome expression at birth and associated outcome in neonates with hypoxic-ischemic encephalopathy in a high-income country (Italy) and their counterparts in low- to middle-income countries (India, Sri Lanka, and Bangladesh).

The RNA samples from the HIC cohort were sequenced (Next generation sequencing) on an Illumina HiSeq4000 to generate 50M reads/sample (Imperial BRC Genomics Facility, Imperial College London).
The RNA samples from the LMIC cohort were sequenced in two separate batches; 55 on a Hiseq150 PE (Genotypic Technology Facility, Bangalore, India) and 44 on an Illumina Hiseq2500 (Medgenome Labs Ltd, Bangalore, India), both with 30M reads/sample 12 .The RNA from 10 neonates were sequenced twice for internal quality check and consistency.The clinical features of the infants in batch 1 and 2 were similar (etable 1).

Analyses
Analyses were carried out separately on samples with good quality RNA by following the same protocol in each cohort.Data was analyzed using 'R' Language and Environment for Statistical Computing (R) (v4.1.0).Principal Component Analysis (PCA) was used as part of the quality control process.The LMIC samples were first merged and any batch effect was then adjusted for using Combat-seq (sva v3.40.0).After batch correction and normalisation, PCA confirmed that batch effects had been removed (eFigure 1).Any sources of variance based on gestational age, gender and birth weight were also assessed (eFigure 2, 3).
We estimated normalisation factors and negative binomial dispersions from the raw count data.With these estimates, we adjusted negative binomial generalised linear models for each gene by conducting likelihood tests.The differential expression analysis was performed by using DESeq2 (v1.32.0) first unadjusted and then adjusted for for birth weight, gender and gestational age.For the LMIC dataset, the DESeq2 model was adjusted also for treatment (hypothermia or usual care) as well as for birth weight, gender and gestational age.All the neonates in HIC with HIE underwent therapeutic hypothermia.Therefore, no adjustment for treatment was performed.The p-values obtained were corrected for multiple testing using the Benjamini-Hochberg false discovery rate (FDR) method.As we wanted to investigate whether there was any overlap in the differentially expressed genes between the two datasets (LMIC and HIC), we repeated the differential expression analysis by using only the genes with non-zero counts and which were in common between LMIC and HIC.
MaSigPro was used to perform a two-step regression model to identify significant genes differentially expressed over the time between a) Neonates with HIE and with adverse outcome versus those with good outcome and b) Neonates with HIE versus healthy controls.We used 3 degrees of freedom, a 0.2 R-squared cut-off value and good outcome and healthy controls as the comparator group, respectively.The procedure firstly defines a general regression model for the data and selects significant genes by computing a regression fit for each gene.This function also computes the p-value associated to the F-Statistic of the model, which is used to select significant genes.By default maSigPro corrects this pvalue for multiple comparisons by applying the linear step-up, false discovery rate procedure.Secondly, stepwise regression is applied as a variable selection strategy to study differences between experimental groups and to find statistically significant different profiles.This approach allows for independent observations (different samples at different time points) and for unbalanced designs and heterogeneous sampling times.
We used Ingenuity Pathway Analysis software (QIAGEN) to assess the biological pathways enriched in the differentially expressed genes which fulfilled the criteria of an FDR <.05 and absolute log2 fold change>1 (canonical pathway analysis).Fisher's exact test was used to assess the significance of the canonical pathways and they were then ranked based on their P-values.The IPA regulation z-score algorithm was used to predict the activation state for a given biological function (increase or decrease). ©

eFigure 1 .
Batch effect correctionPrincipal component analysis of HELIX batch 1 (red) and HELIX batch 2 (green) after quantile normalization and ComBat-Seq (LMIC).Each dot represents a patient and is colored according to batch number.Square shapes represent good and triangle shapes represent adverse outcome.
2024 Montaldo P et al.JAMA Network Open.eTable 1. Clinical variables of HIE infants in LMICs batches 1 and 2 Mean (SD) age and mean (SD) body temperature at blood sample collection Clinical characteristics of babies from the HELIX trial (LMIC) with gene expression data and that were included in the analysis, and those without gene expression data a © 2024 Montaldo P et al.JAMA Network Open.eTable 2.a Good quality ribonucleic acid samples were available at 2 h (T0) before the start of hypothermia and subsequently at 26 h (T1), 51 h (T2) and 75 h (T3) after birth from 28, 24, 24 and 28 neonates respectively.bGood quality RNA samples were available at 2 h (T0), 27 h (T1), 51 h (T2) and 72 h (T3) after birth from 13, 12, 11 and 10 neonates.Abbreviations: HIC, high income countries; LMIC, low and middle-income countries.aData are presented as number (percentage) unless otherwise indicated.b Clinical seizures only.